In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline showing the conversion of the models from Python libraries to FPGA chip synthesis and implementation. Then, we review the main alternatives for the hardware implementation of nonlinear activation functions. The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware. The performance in Q-factor is presented for a bidirectional long short-term memory equalizer coupled with a convolutional NN (biLSTM + CNN), a CNN equalizer, and standard 1-StpS digital back-propagation (DBP), for the simulated and experimental propagation of a single-channel dual-polarization (SC-DP) 16QAM signal at 34 GBd along 17×70 km of LEAF. The biLSTM + CNN equalizer provides a result similar to DBP and a 1.7 dB Q-factor gain over the chromatic dispersion compensation baseline on the experimental dataset. After that, we assess the Q-factor and the impact on hardware utilization when approximating the NN activation functions using Taylor series, piecewise linear, and look-up table (LUT) approximations. We also show how to mitigate the approximation errors with extra training and provide some insights into possible gradient problems in the LUT approximation. Finally, to evaluate the complexity of a hardware implementation that achieves 400G throughput, fixed-point NN-based equalizers with approximated activation functions are developed and implemented in an FPGA.
translated by Google Translate
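The three activation-function approximations named in the abstract above (Taylor series, piecewise linear, and LUT) can be illustrated with a minimal Python sketch for tanh. The expansion order, the three-segment clipping, the table size, and the input range are all assumptions chosen for illustration and are not taken from the paper; the function names are hypothetical.

```python
import math

def tanh_taylor(x):
    """Third-order Taylor expansion of tanh around 0 (assumed order); accurate only near 0."""
    return x - x**3 / 3.0

def tanh_pwl(x):
    """Three-segment piecewise-linear approximation (assumed shape): clip x to [-1, 1]."""
    return max(-1.0, min(1.0, x))

def make_tanh_lut(n_entries=64, x_max=4.0):
    """Precompute tanh on a uniform grid over [-x_max, x_max] (assumed size and range)."""
    step = 2.0 * x_max / (n_entries - 1)
    table = [math.tanh(-x_max + i * step) for i in range(n_entries)]
    return table, step

def tanh_lut(x, table, step, x_max=4.0):
    """Nearest-entry lookup; inputs outside the grid saturate to the end entries."""
    idx = int(round((x + x_max) / step))
    idx = max(0, min(len(table) - 1, idx))
    return table[idx]

def max_abs_error(approx, lo=-4.0, hi=4.0, n=801):
    """Largest deviation from math.tanh on a uniform test grid over [lo, hi]."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return max(abs(approx(x) - math.tanh(x)) for x in xs)
```

A nearest-entry LUT like this one is piecewise constant, so its derivative is zero almost everywhere; that is one plausible reading of the gradient problems the abstract mentions, although the abstract itself does not spell out the cause.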
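In this paper, a new methodology is proposed that allows for the low-complexity development of neural network (NN)-based equalizers for the mitigation of impairments in high-speed coherent optical transmission systems. We provide a comprehensive description and comparison of various deep model compression approaches that have been applied to feedforward and recurrent NN designs. Additionally, we evaluate the influence these strategies have on the performance of each NN equalizer. Quantization, weight clustering, pruning, and other cutting-edge strategies for model compression are taken into consideration. We propose and evaluate a Bayesian optimization-assisted compression, in which the hyperparameters of the compression are chosen to simultaneously reduce complexity and improve performance. To complete the analysis, the trade-off between the complexity of each compression approach and its performance is evaluated using both simulated and experimental data. By utilizing optimal compression approaches, we show that it is possible to design an NN-based equalizer that has better performance than the conventional digital back-propagation (DBP) equalizer with only one step per span. This is accomplished by reducing the number of multipliers used in the NN equalizer after applying the weight clustering and pruning algorithms. Furthermore, we demonstrate that an NN-based equalizer can also achieve superior performance while still maintaining the same degree of complexity as the full electronic chromatic dispersion compensation block. We conclude our analysis by highlighting open questions and existing challenges, as well as possible future research directions.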
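Two of the compression strategies named in the abstract above, pruning and quantization, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: magnitude-based pruning and signed fixed-point rounding are generic textbook variants, not necessarily the ones the authors use, and the function names, bit widths, and sparsity level are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero the fraction `sparsity` of weights with the smallest absolute value."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

def quantize_fixed_point(weights, frac_bits=4, total_bits=8):
    """Round to signed fixed point with `frac_bits` fractional bits, saturating on overflow."""
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    out = []
    for w in weights:
        q = max(q_min, min(q_max, round(w * scale)))
        out.append(q / scale)
    return out

def count_multipliers(weights):
    """Nonzero weights are a rough proxy for the multiplications that remain."""
    return sum(1 for w in weights if w != 0.0)
```

For example, pruning `[0.9, -0.05, 0.3, -0.7, 0.01, 0.2]` at 50% sparsity zeroes the three smallest-magnitude entries, leaving three multiplications under this proxy.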
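Large language models have demonstrated the ability to condition on and generate both natural language and programming language text. Such models open up the possibility of multi-language code generation: can code generation models generalize knowledge from one language to another? Although contemporary code generation models can generate semantically correct Python code, little is known about their abilities with other languages. We facilitate the exploration of this topic by proposing MultiPL-E, the first multi-language parallel benchmark for natural-language-to-code generation. MultiPL-E extends the HumanEval benchmark (Chen et al., 2021) to support 18 more programming languages, encompassing a range of programming paradigms and popularity. We evaluate two state-of-the-art code generation models on MultiPL-E: Codex and InCoder. We find that on several languages, Codex matches and even exceeds its performance on Python. The range of programming languages represented in MultiPL-E allows us to explore the impact of language frequency and language features on model performance. Finally, the MultiPL-E approach of compiling code generation benchmarks to new programming languages is both scalable and extensible. We describe a general approach for easily adding support for new benchmarks and languages.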
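For the first time, recurrent and feedforward neural network-based equalizers for nonlinearity compensation are implemented in an FPGA, with a level of complexity comparable to that of a dispersion equalizer. We demonstrate that the NN-based equalizers can outperform 1 step-per-span DBP.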
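We outline emerging opportunities and challenges for enhancing the utility of AI for scientific discovery. The distinct goals of AI for industry versus the goals of AI for science create a tension between identifying patterns in data and discovering patterns in the world from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and data-driven AI learning machines, then we expect that these AI models can transform hypothesis generation, scientific discovery, and the scientific process itself.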
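We address the problem of tracking 3D object poses from touch during in-hand manipulation. Specifically, we look at tracking small objects using vision-based tactile sensors that provide high-dimensional tactile image measurements at the point of contact. While prior work has relied on a priori information about the object being localized, we remove this requirement. Our key insight is that an object is composed of several local surface patches, each informative enough to achieve reliable object tracking. Moreover, we can recover the geometry of this local patch online by extracting the local surface normal information embedded in each tactile image. We propose a novel two-stage approach. First, we learn a mapping from tactile images to surface normals using an image translation network. Second, we use these surface normals within a factor graph both to reconstruct a local patch map and to infer 3D object poses from it. We demonstrate reliable object tracking over more than 100 contact sequences across unique shapes, with four objects in simulation and two objects in the real world. Supplementary video: https://youtu.be/jwntc9_nh8m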
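Social media is commonly used by the public during election campaigns to express opinions on different issues. Among the various social media channels, Twitter provides an efficient platform for researchers and politicians to explore public opinion on a wide range of topics such as the economy and foreign policy. The current literature mainly focuses on analyzing the content of tweets without considering the gender of users. This study collects and analyzes a large number of tweets and uses computational, human coding, and statistical analyses to identify topics in more than 300,000 tweets posted during the 2020 U.S. presidential election. Our findings cover a wide range of topics, such as tax, climate change, and COVID-19. There is a significant difference between female and male users for more than 70% of the topics.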
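We address the problem of learning observation models end-to-end for estimation. Robots operating in partially observable environments must infer latent states using observation models that capture the joint distribution between latent states and observations. This inference problem can be formulated as an objective over a graph that optimizes for the most likely sequence of states given all previous measurements. Prior work uses observation models that are either known a priori or trained on surrogate losses independent of the graph optimizer. In this paper, we propose a method to directly optimize end-to-end tracking performance by learning observation models with the graph optimizer in the loop. This direct approach may appear, however, to require the inference algorithm to be fully differentiable, which many state-of-the-art graph optimizers are not. Our key insight is to instead formulate the problem as one of energy-based learning. We propose a novel approach, LEO, for learning observation models end-to-end with graph optimizers that may be non-differentiable. LEO alternates between sampling trajectories from the graph posterior and updating the model to match these samples to ground-truth trajectories. We propose a way to generate such samples efficiently using incremental Gauss-Newton solvers. We compare LEO against baselines on datasets drawn from two distinct tasks: navigation and real-world planar pushing. We show that LEO is able to learn complex observation models with lower errors and fewer samples. Supplementary video: https://youtu.be/yqzlupudfka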
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
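The idea behind NAIVEATTACK, as the abstract above describes it, is to add triggers to the raw data before distillation begins. A minimal sketch of that step might look as follows, assuming a square patch in the bottom-right corner as the trigger, a fixed poisoning rate, and relabeling to a target class. None of these specifics come from the abstract, and the function names are hypothetical.

```python
import random

def stamp_trigger(image, size=2, value=1.0):
    """Return a copy of a 2-D image (list of rows) with a size x size patch
    of `value` in the bottom-right corner (assumed trigger pattern)."""
    out = [row[:] for row in image]
    h, w = len(out), len(out[0])
    for r in range(h - size, h):
        for c in range(w - size, w):
            out[r][c] = value
    return out

def poison_dataset(images, labels, target_label, rate=0.1, seed=0):
    """Stamp the trigger on a random fraction `rate` of images and relabel
    them to `target_label`; all other samples are copied unchanged."""
    rng = random.Random(seed)
    n_poison = int(len(images) * rate)
    chosen = set(rng.sample(range(len(images)), n_poison))
    new_images, new_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            new_images.append(stamp_trigger(img))
            new_labels.append(target_label)
        else:
            new_images.append([row[:] for row in img])
            new_labels.append(lab)
    return new_images, new_labels, chosen
```

The poisoned set would then be passed to whatever distillation procedure is in use. DOORPING, which updates the trigger throughout distillation, is not covered by this sketch.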
We present a dynamic path planning algorithm to navigate an amphibious rotor craft through a concave time-invariant obstacle field while attempting to minimize energy usage. We create a nonlinear quaternion state model that represents the rotor craft dynamics above and below the water. The six-degree-of-freedom dynamics are used within a layered architecture to generate motion paths for the vehicle to follow and the required control inputs. The rotor craft has a three-dimensional map of its surroundings that is updated via limited-range onboard sensor readings within the current medium (air or water). Path planning is done via PRM and D* Lite.